Project Description

Introduction

  • Bike Sharing Systems (BSS) is a very widely used method of transportation for many people worldwide.

  • They have several advantages:

    • Non-polluting.
    • Circumvents traffic congestion.
    • Excellent for last-mile connections.
    • Have a convenient payment system.

Motivation

  • BSSs use self-serve bikes locked to docking stations.
  • Bikes are unlocked, used for a trip, then left at the closest station to their destination.
  • Bikes need to be relocated from low demand to high demand stations.

Problem Definition

  • The relocation process could be costly and time consuming.
  • Stations with high demand cannot be identified in advance.
  • Stations with high return rate cannot be identified in advance.
  • Relocation may not be important if return rate is high enough to fill the demand.

Objectives

The objective is to create multivariate regression models capable of estimating the following

  • Demand prediction: Number of bikes needed at each individual station.
  • Return prediction: Number of bikes returned at each individual station.

Background

Bike Sharing vs Bike Renting

  • Bike Renting:
    • Users return bikes to the same location they rented from.
    • Not a mode of transportation, used for leisure.
    • Exclusively a private business.
    • Not self-serve.
  • Bike Sharing:
    • Users return bikes to a different station to the one they rented from.
    • Used primarily as a mode of transportation.
    • Government or institutionally supported.
    • Mostly self-serve.

BSS generations

  • First Generation:
    • Unregulated free bikes, no parking or docking.
    • Failed due to stolen and damaged bikes.

BSS generations

  • First Generation:
    • Unregulated free bikes, no parking or docking.
    • Failed due to stolen and damaged bikes.
  • Second Generation:
    • Docked bikes paid with refundable coins.
    • Still in use in some countries.

BSS generations

  • First Generation:
    • Unregulated free bikes, no parking or docking.
    • Failed due to stolen and damaged bikes.
  • Second Generation:
    • Docked bikes paid with refundable coins.
    • Still in use in some countries.
  • Third Generations:
    • The most common modern BSS.
    • Membership cards or payment through credit card on mobile app.

BSS generations

  • First Generation:
    • Unregulated free bikes, no parking or docking.
    • Failed due to stolen and damaged bikes.
  • Second Generation:
    • Docked bikes paid with refundable coins.
    • Still in use in some countries.
  • Third Generations:
    • The most common modern BSS.
    • Membership cards or payment through credit card on mobile app.
  • Fourth Generation:
    • Bikes have built in locks to prevent theft.
    • No need for docking, just use, finish trip, leave anywhere.

Demand Prediction Models-Models

  • Univariate Time Series:
    • Autoregressive moving Average (ARMA)
    • Autoregressive Integrated Moving Average (ARIMA)
  • Multivariate Time Series:
    • Deep Learning with LSTM
    • Random Forest.
    • Boosting Algorithms.

Demand Prediction Models-Features

  • Demographics:

    • Wealth.
    • Average age.
  • Community

  • Weather

  • Sliding Window Statistics:

    • Standard deviations.
    • Averages.

NB: Weather is especially important. BIXI BSS in Montreal is not operational in unfavorable weather, typically in mid-November to April.

Evaluation Metrics

  • Mean Absolute Error (MAE): Good for calculating exact error value.

\[ \sum_{i=1}^{D}|x_i-y_i| \]

  • Mean Square Error (MSE): Good for penalizing outliers. \[ \sum_{i=1}^{D}(x_i-y_i)^2 \]

  • Root Mean Square Error (RMSE): A more interpretable version of MSE. \[ \sum_{i=1}^{D}\sqrt{(x_i-y_i)^2} \]

Dataset

Lyft/Baywheels

The data was provided by Lyft, the owner of the BSS company Baywheels (formally GoBike) operating in San Francisco Bay Area.

  • The data is available as far as 2017 before Lyft’s acquisition of GoBike.
  • The data used however is limited to 2021 and 2022 because:
    • The disruption caused by the 2020 COVID pandemic.
    • Potential business growth gap between 2019 and 2021.

Weather Data

  • Weather data is very relevant in determining the demand.
  • The weather data for San Francisco Bay Area is provided by Meteostat through a Python API.

Dataset Description

Size:

  • Original Number of Trips: 4744199 trips .
  • Final Number of Trips: 3855197 trips.
  • Number Of Stations: 535
  • Period: From January 2021 to December 2022

Relevant Columns:

  • Station Name (start/end)
  • Time (start/end)
  • Coordinates (start/end)
  • Ride ID

Cleaning Process

Cleaning Process

  • Station Standardization:

    • Stations in a particular street don’t always have the same coordinates.
    • Many stations don’t have a standardized name or ID, making them unidentifiable.
    • Solution To Reduce Data Loss:
      • Use a single coordinate for any identifiable stations.
      • Approximate the closest standard station to any trip.
      • If the closest station is less than 500 meters away, keep the trip, otherwise, drop.

Cleaning Process

  • Same Station Trip:
    • Several trips usually take place in several minutes, with the end station being the same as start stations.
    • This could be a result of users trying out the system or changed minds.
    • To prevent redundancy, any same station trip with duration less than 4 minutes will be removed.

Cleaning Process

  • Clustering:
    • Helps reduce weather data size by:
      • Approximating areas closest to each other.
      • Approximating the weather conditions for each cluster.
    • Separates the areas the BSS serves.

Cleaning Process

Demand and Returns

Definitions

  • Demand: Number of trips starting at a particular station.
  • Returns: Number of trips ending at a particular station.
  • “Returns” does not indicate actual number of bikes available, but it indicates how many bikes finished their trips there, which when compared with demand should provide a good idea about the available bikes.
  • Therefore, demand will be the main focus, while returns will be taken as secondary.

Demand

Demand and Returns

Demand and Returns

Demand Spread

Demand Spread

Demand Spread

References

  • Susan Shaheen, Stacey Guzman, and Hua Zhang. Bikesharing in europe, the americas, and asia. Transportation Research Record, pages 159–167, 1 2010.
  • Andreas Nikiforiadis, Katerina Chrysostomou, and Georgia Aifadopoulou. Exploring travelers’ characteristics affecting their intention to shift to bike-sharing systems due to a sophisticated mobile app. Algorithms, 12:264, 12 2019.
  • Jung-Hoon Cho, Young-Hyun Seo, and Dong-Kyu Kim. Efficiency comparison of public bike-sharing repositioning strategies based on predicted demand patterns. Transportation Research Record: Journal of the Trans-portation Research Board, 2675:104–118, 11 2021.
  • Aliasghar Mehdizadeh Dastjerdi and Catherine Morency. Bike-sharing demand prediction at community level under covid-19 using deep learning. Sensors, 22:1060, 1 2022.
  • Bike share in the san francisco bay area — bay wheels — lyft.
  • Ahmed Ghanem, Hesham A. Rakha, and Leanna House. Modeling bike availability in a bike-sharing system using machine learning. pages 374–378. IEEE, 6 2017.
  • Andreas Kaltenbrunner, Rodrigo Meza, Jens Grivolla, Joan Codina, and Rafael Banchs. Urban cycles and mobility patterns: Exploring and predicting trends in a bicycle-based public transport system. Pervasive and Mobile Computing, 6:455–466, 8 2010.
  • Alvaro Lozano, Juan De Paz, Gabriel Villarrubia Gonz ́alez, Daniel Iglesia, and Javier Bajo. Multi-agent system for demand prediction and trip visualization in bike sharing systems. Applied Sciences, 8:67, 1 2018.

  • R. Alexander Rixey. Station-level forecasting of bikesharing ridership. Transportation Research Record: Journal of the Transportation Research Board, 2387:46–55, 1 2013.
  • Young-Hyun Seo, Sangwon Yoon, Dong-Kyu Kim, Seung-Young Kho, and Jaemin Hwang. Predicting demand for a bike-sharing system with station activity based on random forest. Proceedings of the Institution of Civil Engineers - Municipal Engineer, 174:97–107, 6 2021

Thank you!